Efficient time-series subsequence matching using duality in constructing windows

Authors

  • Yang-Sae Moon
  • Kyu-Young Whang
  • Woong-Kee Loh
Abstract

In this paper, we propose a new subsequence matching method, Dual Match. Dual Match exploits duality in constructing windows and significantly improves performance. Dual Match divides data sequences into disjoint windows and the query sequence into sliding windows, and thus is a dual approach of the one by Faloutsos et al. (Proceedings of the ACM SIGMOD International Conference on Management of Data, Seattle, Washington, 1994, pp. 419–429) (FRM for short), which divides data sequences into sliding windows and the query sequence into disjoint windows. FRM causes a lot of false alarms (i.e., candidates that do not qualify) by storing minimum bounding rectangles rather than individual points representing windows to save storage space for the index. Dual Match solves this problem by directly storing points without incurring excessive storage overhead. Experimental results show that, in most cases, Dual Match provides a large improvement both in false alarms and performance over FRM given the same amount of storage space. In particular, for low selectivities (less than 10^-4), Dual Match significantly improves performance up to 430-fold. On the other hand, for high selectivities (more than 10^-2), it shows a very minor degradation (less than 29%). For selectivities in between (10^-4 to 10^-2), Dual Match shows performance slightly better than that of FRM. Overall, these results indicate that our approach provides a new paradigm in subsequence matching that improves performance significantly in large database applications. © 2001 Elsevier Science Ltd. All rights reserved.
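
The window duality described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes Euclidean distance, compares raw windows in a nested loop instead of storing points in a multidimensional index, and uses the full tolerance `eps` as the window-level filter. The names `dual_match_sketch`, `euclidean`, `w`, and `eps` are hypothetical. The sketch also assumes the query has at least 2w - 1 points, so that every data subsequence of the query's length fully contains at least one disjoint window.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dual_match_sketch(data, query, eps, w):
    """Illustrative filter-and-verify search (hypothetical helper).

    Data sequence  -> disjoint windows of size w.
    Query sequence -> sliding windows of size w.
    Returns (sorted matching offsets, number of candidates checked).
    """
    n, m = len(data), len(query)
    if m < 2 * w - 1:
        # Assumption of this sketch: shorter queries could miss matches.
        raise ValueError("query length must be at least 2*w - 1")

    # Disjoint windows of the data sequence: offsets 0, w, 2w, ...
    disjoint = [(s, data[s:s + w]) for s in range(0, n - w + 1, w)]
    # Sliding windows of the query sequence: offsets 0, 1, 2, ...
    sliding = [(j, query[j:j + w]) for j in range(m - w + 1)]

    # Filter step: a close window pair (s, j) implies a candidate
    # subsequence starting at data offset s - j.
    candidates = set()
    for s, dwin in disjoint:
        for j, qwin in sliding:
            offset = s - j
            if 0 <= offset <= n - m and euclidean(dwin, qwin) <= eps:
                candidates.add(offset)

    # Verify step: discard false alarms using the full-length distance.
    matches = sorted(
        k for k in candidates if euclidean(data[k:k + m], query) <= eps
    )
    return matches, len(candidates)


# Toy data, invented for illustration only.
data = [0, 1, 2, 3, 4, 5, 4, 3, 2, 1, 0, 1, 2, 3, 4, 5]
query = [3, 4, 5, 4, 3, 2, 1]
matches, n_candidates = dual_match_sketch(data, query, eps=0.5, w=4)
print(matches, n_candidates)
```

The filter cannot dismiss a true match in this setting, because the distance between two aligned windows never exceeds the distance between the full sequences that contain them. The verification step is what removes false alarms.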

Similar resources

Efficient Time-series Subsequence Matching Using Duality in Constructing Windows

Subsequence matching in time-series databases is an important problem in data mining and has attracted a lot of research interest. It is a problem of finding the data sequences containing subsequences similar to a given query sequence and of finding the offsets of these subsequences in the original data sequences. In this paper, we propose a new approach (Dual Match) to subsequence matching that exp...

Duality-Based Subsequence Matching in Time-Series Databases

In this paper, we propose a new subsequence matching method, Dual Match, which exploits duality in constructing windows and significantly improves performance. Dual Match divides data sequences into disjoint windows and the query sequence into sliding windows, and thus is a dual approach of the one by Faloutsos et al. (FRM for short), which divides data sequences into sliding windows and the quer...

Linear Detrending Subsequence Matching in Time-Series Databases

Each time series has its own linear trend, the directionality of the time series, and removing the linear trend is crucial for getting more intuitive matching results. Supporting linear detrending in subsequence matching is a challenging problem due to the huge number of possible subsequences. In this paper, we define this problem as linear detrending subsequence matching and propose its efficien...

Similar Subsequence Search in Time Series Databases

Finding matching subsequences in time series data is an important problem. The classical approach to searching for matching subsequences has been based on the principle of exhaustive search, where all possible candidates are generated and evaluated or all the terms of the time series in a database are examined. As a result, most of the subsequence search algorithms are cubic in nature with few algorithm...

A Subsequence Matching with Gaps-Range-Tolerances Framework: A Query-By-Humming Application

We propose a novel subsequence matching framework that allows for gaps in both the query and target sequences, variable matching tolerance levels efficiently tuned for each query and target sequence, and also constrains the maximum match length. Using this framework, a space and time efficient dynamic programming method is developed: given a short query sequence and a large database, our method...

Journal:
  • Inf. Syst.

Volume 26  Issue 

Pages  -

Publication date: 2001